Scheduling distributed multiway spatial join queries: optimization models and algorithms

Abstract

Multiway spatial joins are a commonly occurring and fundamental type of query for spatial data processing. This article presents models and algorithms to schedule this type of query in distributed database systems while attempting to strike a balance between the makespan and communication costs. We propose three algorithms based on combinatorial optimization methods: the well-known linear relaxation technique of rounding the solution generated by linear programming (LP), the more sophisticated Lagrangian Relaxation method (LR), as well as a greedy heuristic (GR) for baseline comparison. Our evaluation shows that the schedule built using GR consumes, on average, 22% more processing resources than the more elaborate schedule constructed via the LR method, when scheduling for 64 machines. The schedule provided by LR is also, on average, an order of magnitude closer to optimal compared to GR. We also show that scheduling Gigabyte-size multiway spatial join queries before execution can reduce its execution time compared to state-of-the-art frameworks that do not have this capability, and significantly reduce the amount of data shuffled in the network.
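
To make the trade-off between makespan and communication cost concrete, the following is a minimal Python sketch of a generic greedy list-scheduling heuristic. It is not the article's GR algorithm, whose details are not given in the abstract. The function name `greedy_schedule`, the inputs `task_cost` and `comm_cost`, and the weight `alpha` are all hypothetical names introduced here for illustration.

```python
from typing import List, Tuple


def greedy_schedule(
    task_cost: List[float],
    comm_cost: List[List[float]],
    num_machines: int,
    alpha: float = 1.0,
) -> Tuple[List[int], float, float]:
    """Assign each task to one machine, most expensive task first.

    Illustrative assumptions (not taken from the article):
      task_cost[t]    -- processing cost of join task t
      comm_cost[t][m] -- cost of shipping task t's input data to machine m
      alpha           -- weight of communication cost relative to machine load
    Returns (assignment, makespan, total communication cost).
    """
    load = [0.0] * num_machines
    assignment = [-1] * len(task_cost)
    total_comm = 0.0

    # Visit tasks in decreasing order of processing cost.
    order = sorted(range(len(task_cost)), key=lambda t: -task_cost[t])
    for t in order:
        # Pick the machine with the lowest resulting load plus weighted
        # communication cost; ties go to the lowest machine index.
        best = min(
            range(num_machines),
            key=lambda m: load[m] + task_cost[t] + alpha * comm_cost[t][m],
        )
        assignment[t] = best
        load[best] += task_cost[t]
        total_comm += comm_cost[t][best]

    return assignment, max(load), total_comm
```

In this sketch, setting `alpha` to 0 reduces the rule to plain load balancing, while larger values keep tasks near their data at the expense of a longer makespan.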

Similar resources

Cost Models for Join Queries in Spatial Databases

The join query is one of the fundamental operations in Data Base Management Systems (DBMSs). Modern DBMSs should be able to support non-traditional data, including spatial objects, in an efficient manner. Towards this goal, spatial data structures can be adopted in order to support the execution of join queries on sets of multidimensional data. This paper introduces analytical models that estim...

An Effective High-Performance Multiway Spatial Join Algorithm with Spark

Multiway spatial join plays an important role in GIS (Geographic Information Systems) and their applications. With the increase in spatial data volumes, the performance of multiway spatial join has encountered a computation bottleneck in the context of big data. Parallel or distributed computing platforms, such as MapReduce and Spark, are promising for resolving the intensive computing issue. P...

JTop Algorithms for Top-k Join Queries

Top-k join queries have become important in many areas of computing. One of the most efficient algorithms for top-k join queries is the Rank-Join algorithm [17] [18]. However, there are many cases where Rank-Join performs many unnecessary accesses to the input data sources. In this report, we first show that there are many cases where Rank-Join's stopping mechanism is not efficient, an...

Search algorithms for multiway spatial joins

This paper deals with multiway spatial joins when (i) there is limited time for query processing and the goal is to retrieve the best possible solutions within this limit, or (ii) there is unlimited time and the goal is to retrieve a single exact solution, if such a solution exists, or the best approximate one otherwise. The first case is motivated by the high cost of join processing in real-time ...

Data-Parallel Spatial Join Algorithms

Efficient data-parallel spatial join algorithms for PMR quadtrees and R-trees, common spatial data structures, are presented. The domain consists of planar line segment data (i.e., Bureau of the Census TIGER/Line files). Parallel algorithms for map intersection and a spatial range query are described. The algorithms are implemented using the SAM (Scan-And-Monotonic-mapping) model of parallel computa...

Journal

Journal title: International Journal of Geographical Information Science

Year: 2023

ISSN: 1365-8824, 1365-8816

DOI: https://doi.org/10.1080/13658816.2023.2170380